A Risk Minimization Framework for Extractive Speech Summarization
Abstract
In this paper, we formulate extractive summarization as a risk minimization problem and propose a unified probabilistic framework that naturally combines supervised and unsupervised summarization models to inherit their individual merits as well as to overcome their inherent limitations. In addition, the introduction of various loss functions also provides the summarization framework with a flexible but systematic way to render the redundancy and coherence relationships among sentences and between sentences and the whole document, respectively. Experiments on speech summarization show that the methods deduced from our framework are very competitive with existing summarization approaches.
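The abstract does not state the decision rule itself, so the following is only a minimal sketch of the generic minimum-expected-risk idea it alludes to: choose the candidate sentence S_i whose expected loss, the sum over j of L(S_i, S_j) * P(S_j | D), is smallest. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names `cosine` and `select_min_risk`, the loss defined as one minus bag-of-words cosine similarity, and the posterior values, which are simply passed in as numbers rather than estimated by a supervised or unsupervised model.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_min_risk(sentences, posteriors):
    """Return the index of the sentence with the lowest expected loss.

    Assumed loss (hypothetical): L(S_i, S_j) = 1 - cosine(S_i, S_j).
    posteriors[j] stands in for P(S_j | D) and is supplied by the caller.
    """
    bags = [Counter(s.lower().split()) for s in sentences]

    def risk(i):
        # Expected loss of picking sentence i, weighted by each posterior.
        return sum(
            (1.0 - cosine(bags[i], bags[j])) * posteriors[j]
            for j in range(len(bags))
        )

    return min(range(len(sentences)), key=risk)


sentences = ["the cat sat", "the cat sat down", "stocks fell sharply"]
best = select_min_risk(sentences, [0.5, 0.3, 0.2])
print(sentences[best])
```

In this toy setup, a sentence is favored when it is similar to other sentences that carry high posterior mass. Swapping in a different loss function changes which sentence relationships are penalized, which is the kind of flexibility the abstract attributes to the framework.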
Similar resources
An Empirical Comparison of Contemporary Unsupervised Approaches for Extractive Speech Summarization
With the rapid growth of the Internet and the arrival of the big data era, automatic summarization has emerged as a popular research topic. The aim of automatic summarization is to select important text or spoken sentences that represent the topic (theme) of the original text or spoken document according to a predefined summarization ratio. In this study we frame automatic summari...
Extractive speech summarization - from the view of decision theory
Extractive speech summarization can be thought of as a decision-making process where the summarizer attempts to select a subset of informative sentences from the original document. Meanwhile, a sentence being selected as part of a summary is typically determined by three primary factors: significance, relevance and redundancy. To meet these specifications, we recently presented a novel probabil...
Effective pseudo-relevance feedback for language modeling in extractive speech summarization
Extractive speech summarization, aiming to automatically select an indicative set of sentences from a spoken document so as to concisely represent the most important aspects of the document, has become an active area for research and experimentation. An emerging stream of work is to employ the language modeling (LM) framework along with the Kullback-Leibler divergence measure for extractive spe...
Semi-supervised extractive speech summarization via co-training algorithm
Supervised methods for extractive speech summarization require a large training set. Summary annotation is often expensive and time consuming. In this paper, we exploit semi-supervised approaches to leverage unlabeled data. In particular, we investigate co-training for the task of extractive meeting summarization. Compared with text summarization, speech summarization task has its unique charac...
Combining Graph Degeneracy and Submodularity for Unsupervised Extractive Summarization
We present a fully unsupervised, extractive text summarization system that leverages a submodularity framework introduced by past research. The framework allows summaries to be generated in a greedy way while preserving near-optimal performance guarantees. Our main contribution is the novel coverage reward term of the objective function optimized by the greedy algorithm. This component builds o...